DETAILED ACTION

Specification 

1. The title of the invention is not descriptive. A new title is required that is clearly
indicative of the invention to which the claims are directed. 

2. The disclosure is objected to because of the following informalities: The inventor 
did not submit a summary for the application. 

Appropriate correction is required. 

Claim Rejections - 35 USC § 103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

4. Claims 1, 5, 7, 9-11, 15, 18, 19 and 20 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Rothweiler et al. (U.S. Pat. 5,668,925) in view of Nishiguchi et 
al. (U.S. Pat. 6,047,253), cited by the applicant.

As per claim 1, Rothweiler teaches a method of coding speech comprising the 
steps of: 

sampling a speech signal (A/D converter, Fig. 1, element 14); 

determining a pitch of the speech signal (pitch epoch detector, col. 8, lines 11-13);

characterizing the voiced quality of the speech signal (determination of the signal
being voiced or unvoiced, col. 9, lines 62-67 and col. 10, lines 8-12); 

implied training of a Lloyd-Max quantizer (col. 13, lines 30-32); and

quantizing the pitch values (pitch quantized by differential quantizer, col. 17, lines 
37-47) from the training step and the pitch values of those speech signals from the 
determining step not characterized as being substantially fully voiced in the 
characterizing step (determines the pitch for all voice segments hence the averaging 
would take into account all voiced and unvoiced segments, col. 10, lines 23-36). 

Rothweiler does not teach training the Lloyd-Max quantizer using the pitch values 
of those speech signals from the determining step characterized as being substantially 
fully voiced in the characterizing step. 

However, the Examiner takes Official Notice that it is common in the art to train a 
quantizer on the type of data that it would quantize and that unvoiced speech segments 
do not carry any pitch information. Therefore, it would have been obvious to one of 
ordinary skill in the art at the time of invention to modify the system of Rothweiler to train 
the Lloyd-Max quantizer using the pitch values of those speech signals from the 
determining step characterized as being substantially fully voiced in the characterizing 
step because it would give a method for quantizing the meaningful pitch value with a 
low error rate, hence improving decoded pitch value accuracy. 
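
For illustration only, the idea of training a scalar quantizer solely on pitch values from voiced frames can be sketched as below. This is a minimal sketch of the generic Lloyd iteration (nearest-level assignment followed by centroid update), not the method of Rothweiler or of the application. The frame data, the choice of four levels, and the names train_lloyd_max and nearest_index are all hypothetical.

```python
# Illustrative sketch only: the frame data, level count, and function names
# are hypothetical and are not taken from any cited reference.

def nearest_index(x, levels):
    """Return the index of the reproduction level closest to x."""
    return min(range(len(levels)), key=lambda i: abs(x - levels[i]))

def train_lloyd_max(samples, num_levels, max_iterations=100):
    """Fit scalar quantizer levels to empirical samples by Lloyd's iteration."""
    lo, hi = min(samples), max(samples)
    step = (hi - lo) / num_levels
    # Start from uniformly spaced levels across the observed range.
    levels = [lo + (i + 0.5) * step for i in range(num_levels)]
    for _ in range(max_iterations):
        # Assign each sample to its nearest level (midpoint decision boundaries).
        cells = [[] for _ in levels]
        for x in samples:
            cells[nearest_index(x, levels)].append(x)
        # Move each level to the centroid of its cell; keep empty cells as-is.
        updated = [sum(c) / len(c) if c else levels[i]
                   for i, c in enumerate(cells)]
        if updated == levels:
            break
        levels = updated
    return levels

# Hypothetical (pitch, is_fully_voiced) pairs; unvoiced frames carry no pitch.
frames = [(80, True), (0, False), (82, True), (85, True), (100, True),
          (102, True), (0, False), (105, True), (120, True), (122, True),
          (125, True), (150, True), (0, False), (152, True), (155, True)]

# Train only on pitch values from frames characterized as fully voiced.
voiced_pitches = [pitch for pitch, voiced in frames if voiced]
levels = train_lloyd_max(voiced_pitches, num_levels=4)

# Quantize a new pitch value against the trained levels.
index = nearest_index(101, levels)
```

Excluding the unvoiced frames keeps the zero-valued placeholders from pulling a level toward values that carry no pitch information.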

Rothweiler does not teach speech coding using perceptual weighting. 

Nishiguchi teaches the speech coder using perceptual weighting (perceptually 
weighted filter, Fig. 1, element 125). 

It would have been obvious to one of ordinary skill in the art at the time of
invention to modify the system of Rothweiler so that the speech coder uses perceptual 
weighting as taught by Nishiguchi because it would lower the quantization noise below 
the level of human perception hence making the system more robust. 

5. As per claim 5, Rothweiler teaches the sampling step does not use error 
correction (does not mention use of error correction in sampling, col. 6, lines 45-62, or 
anywhere else). 

6. As per claim 7, Rothweiler teaches storing the quantized pitch values in a 
memory for later decoding, synthesis and playback (encoded pitch signal sent to buffer, 
col. 17, lines 45-47). 

7. As per claim 9, Rothweiler teaches 

determining a gain of the speech signal (signal power, col. 7, lines 26-30 and col. 
13, line 13), 

the implied training step includes training a Lloyd-Max quantizer with the gain 
values (gain quantizer uses Lloyd-max algorithm, Fig. 3 and col. 13, lines 30-32), and 

the quantizing step includes quantizing the gain values from the implied training 
step and the gain values of those speech signals from the determining step not 
characterized as being substantially fully voiced in the characterizing step (all frames
are quantized, col. 13, lines 39-51).

Rothweiler does not teach training the quantizer with speech signals from the 
determining step characterized as being substantially fully voiced in the characterizing 
step. 

However, the Examiner takes Official Notice that using only voiced speech
signals in training for gain quantization is common in the art. Therefore, it would have 
been obvious to one of ordinary skill in the art at the time of invention to modify the 
system of Rothweiler to train the quantizer with speech signals from the
determining step characterized as being substantially fully voiced in the characterizing 
step because it would be more important to get the gain right for voiced portions to 
avoid "zipper noise" due to volume variations. 

8. As per claims 10 and 19, Rothweiler teaches the fully voiced speech signal is 
synthesized using a pitch periodic excitation train (periodic signals are generated by a 
pitch generator and supplied to the filter, col. 19, lines 2-8). 

Rothweiler does not teach that the speech that is not fully voiced is synthesized 
using a lowpass filtered pitch periodic excitation signal mixed with highpass white noise. 

Nishiguchi teaches a method for synthesizing speech that mixes a lowpass 
filtered pitch periodic excitation signal (voiced sound filtered by post-filter, col. 10, lines 
40-44) with a highpass white noise (unvoiced section filters noise codebook vectors to
combine with the pitch excitation train, Fig. 4, element 220).

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify the system of Rothweiler so that speech that is not fully voiced is 
synthesized using a lowpass filtered pitch periodic excitation signal mixed with highpass 
white noise as taught by Nishiguchi because partially voiced speech would need the 
combination of a periodic pulse train and noise to reconstruct the signal accurately. 

9. As per claims 11 and 20, Rothweiler teaches pitch periodic excitation trains with
substantially flat spectral response (the excitation pulse width of the pitch pulse train would
tend to be short so that its spectral response would tend to be flat, col. 19, lines 3-8).

10. As per claim 15, Rothweiler teaches an apparatus for coding speech comprising:

a buffer, the buffer inputs a speech signal and stores samples thereof (sampler
would necessarily have a buffer, col. 7, lines 9-17);

a pitch detector coupled to the buffer, the pitch detector determines a pitch of the 
speech signal (pitch of the signal, col. 10, lines 23-25); 

a voicing analyzer coupled to the pitch detector; the voicing analyzer 
characterizes the speech signal as to whether it is substantially fully voiced 
(determination of the signal being voiced or unvoiced, col. 9, lines 62-67 and col. 10, 
lines 8-12); and 

a Lloyd-Max quantizer that is necessarily trained with the input speech signal 
(col. 13, lines 30-32), the quantizer also quantizes the pitch values of those speech 
signals from the pitch detector not characterized as being substantially fully voiced 
(determines the pitch for all voice segments hence the averaging would take into account
all voiced and unvoiced segments, col. 10, lines 23-36).

Rothweiler does not teach training the Lloyd-Max quantizer using the pitch values 
of those speech signals from the determining step characterized as being substantially 
fully voiced in the characterizing step. 

However, the Examiner takes Official Notice that it is common in the art to train a 
quantizer on the type of data that it would quantize and that unvoiced speech segments 
do not carry any pitch information. Therefore, it would have been obvious to one of
ordinary skill in the art at the time of invention to modify the system of Rothweiler to train 
the Lloyd-Max quantizer using the pitch values of those speech signals from the 
determining step characterized as being substantially fully voiced in the characterizing 
step because it would give a method for quantizing the meaningful pitch value with a 
low error rate, hence improving decoded pitch value accuracy. 

Rothweiler does not teach speech coding using perceptual weighting. 

Nishiguchi teaches the speech coder using perceptual weighting (perceptually 
weighted filter, Fig. 1, element 125). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify the system of Rothweiler so that the speech coder uses perceptual 
weighting as taught by Nishiguchi because it would lower the quantization noise below 
the level of human perception hence making the system more robust. 

11. As per claim 18, Rothweiler teaches

a gain detector coupled between the buffer and quantizer (half frame power
block, Fig. 2a, element 216), 

a Lloyd-Max quantizer is trained with the gain values (gain quantizer uses Lloyd- 
max algorithm for implied training, col. 13, lines 30-32), and 

the quantizer also quantizes the gain values from the training step and the gain 
values of those speech signals from the determining step not characterized as being 
substantially fully voiced in the characterizing step (all frames are quantized, col. 13,
lines 39-51).

Rothweiler does not teach training the quantizer with speech signals from the
determining step characterized as being substantially fully voiced in the characterizing 
step. 

However, the Examiner takes Official Notice that using only voiced speech 
signals in training for gain quantization is common in the art. Therefore, it would have 
been obvious to one of ordinary skill in the art at the time of invention to modify the 
system of Rothweiler to train the quantizer with speech signals from the
determining step characterized as being substantially fully voiced in the characterizing 
step because it would be more important to get the gain right for voiced portions to 
avoid "zipper noise" due to volume variations. 

12. Claims 2 and 16 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Rothweiler in view of Satyamurti et al. (U.S. Pat. 5,699,404). 

Rothweiler teaches that the errors include pitch doubling, a common error found 
in pitch estimation (col. 1, lines 45-47), but teaches pitch averaging rather than median 
filtering the pitch values of those speech signals characterized as being substantially 
fully voiced in the characterizing step, thereby removing these pitch-doubling errors. 

Satyamurti teaches median filtering the pitch values of those speech signals 
characterized as being substantially fully voiced in the characterizing step, thereby 
necessarily removing pitch doubling errors (pitch values found from voiced speech 
blocks and are median filtered to eliminate errors, col. 3, lines 11-20).

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify the averaging system of Rothweiler to use median filtering of the pitch
values of those speech signals characterized as being substantially fully voiced in the
characterizing step, for removing pitch doubling errors as taught by Satyamurti because 
pitch doubling errors would cause greater error in arithmetic mean than in a median and 
this filtering would reduce the amount of pitch error. 
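
For illustration only, the stated rationale (a pitch-doubling error distorts an arithmetic mean more than a median) can be checked with a small numeric sketch. The pitch track below is hypothetical, with one value roughly doubled, and the sliding-window helper median_filter is an assumed name rather than anything described in Rothweiler or Satyamurti.

```python
from statistics import mean, median

# Hypothetical pitch values from voiced frames; 204 is a doubling error.
pitch_track = [100, 102, 101, 204, 99, 100, 101]

# The single outlier pulls the mean upward but leaves the median unchanged.
track_mean = mean(pitch_track)
track_median = median(pitch_track)

def median_filter(values, window=3):
    """Replace each value by the median of its neighborhood (edges truncated)."""
    half = window // 2
    filtered = []
    for i in range(len(values)):
        start = max(0, i - half)
        stop = min(len(values), i + half + 1)
        filtered.append(median(values[start:stop]))
    return filtered

# After filtering, the isolated doubled value no longer appears in the track.
smoothed = median_filter(pitch_track)
```

A three-point window removes only isolated errors. A run of two or more consecutive doubled values would survive it, so a wider window would be needed in that case.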

13. Claims 3 and 4 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Rothweiler in view of Nishiguchi '253 and in further view of Nishiguchi et al. (U.S. Pat. 
5,960,388). 

As per claim 3, Rothweiler teaches dividing the speech signal into a plurality of 
frequency spectrum bands (filter bank, col. 10, lines 37-40). 

Rothweiler and Nishiguchi '253 do not teach establishing the voiced quality of the 
speech signal in each spectrum band, and describing the speech signal as being 
substantially fully voiced if a majority of the plurality spectrum bands are established to 
be of a speech signal of a voiced quality. 

Neither Rothweiler nor Nishiguchi '253 teach describing the speech signal as 
being substantially fully voiced if a majority of the plurality spectrum bands are 
established to be of a speech signal of a voiced quality. 

Nishiguchi '388 teaches establishing the voiced quality of the speech signal in 
each spectrum band (voiced and unvoiced portions are present in each frequency band, 
col. 6, lines 7-14), and describing the speech signal as being voiced by counting the 
bands that are voiced and making the comparison between the voiced bands and the 
unvoiced bands (ratio, col. 28, lines 54-63). 

However, it would have been obvious to one of ordinary skill in the art at the time
of invention to modify the system of Rothweiler and Nishiguchi '253 to describe the 
speech signal as being substantially fully voiced if a majority of the plurality spectrum 
bands are established to be of a speech signal of a voiced quality because this would 
simplify the coding. 

14. As per claim 4, neither Rothweiler nor Nishiguchi '253 teach the dividing step 
includes five spectrum bands. 

Nishiguchi '388 teaches at least 11 bands (Fig. 7A), thus including five.

Therefore, it would have been obvious to one of ordinary skill in the art at the 
time of invention to include five (or more) spectrum bands because it would enable 
more accurate perceptual coding by better approximating the ear's critical bands. 

15. Claims 6 and 17 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Rothweiler in view of Nishiguchi '253 and in further view of Simpson et al. (U.S. Pat.
6,772,126).

Rothweiler teaches buffering the speech signal for a multiple of frames to be 
block quantized in subsequent steps (quantizer quantizes the signal after a three frame 
delay, col. 17, lines 37-40). 

Neither Rothweiler nor Nishiguchi teach the number of buffered frames of speech 
is increased during periods of substantially voiced speech to enable more accurate 
coding during the subsequent steps. 

Simpson teaches buffering the speech signal for a multiple of frames to be block 
quantized in subsequent steps (pitch is quantized in blocks of four pitch values, col. 41,
lines 19-20), wherein the number of buffered frames of speech is increased during
periods of substantially voiced speech to enable more accurate coding during the 
subsequent steps (pitch values buffered only for voiced frames, hence the number of 
buffered frames is necessarily increased, col. 41, lines 20-25). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify the system of Rothweiler and Nishiguchi so that the number of 
buffered frames of speech is increased during periods of substantially voiced speech as 
taught by Simpson because vector quantization would achieve a lower bit rate for 
voiced frames compared to a scalar quantization, hence being more efficient. 

16. Claim 8 is rejected under 35 U.S.C. 103(a) as being unpatentable over
Rothweiler in view of Nishiguchi '253 and in further view of Scott et al. (U.S. Pat. 
4,969,193). 

Rothweiler teaches quantizing using three bits per pitch value (col. 17, lines 41- 
45), but does not teach quantizing using two bits per pitch value. Neither does 
Nishiguchi. 

Scott teaches quantizing using two bits per pitch value (col. 9, lines 26-31).

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify the system of Rothweiler and Nishiguchi to quantize using two bits 
per pitch value as taught by Scott because it would lower the amount of bits used to 
represent the pitch information hence lowering the bit rate and making the system more 
efficient. 

17. Claims 12 and 13 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Rothweiler in view of Simpson and in further view of Nishiguchi '253. 

As per claim 12, Rothweiler teaches a method of coding speech comprising the 
steps of: 

sampling a speech signal (A/D converter, Fig. 1, element 14); 

buffering the speech signal for a multiple of frames to be block quantized in 
subsequent steps (quantizer quantizes the signal after a three frame delay, col. 17, lines 
37-40); 

determining a pitch of the speech signal (average pitch of the signal, col. 10, 
lines 23-26); 

characterizing the voiced quality of the speech signal (determination of the signal 
being voiced or unvoiced, col. 9, lines 62-67 and col. 10, lines 1-9); 

implied training of a Lloyd-Max quantizer (col. 13, lines 30-32); 

quantizing the pitch values (pitch quantized by differential quantizer, col. 17, lines 
37-47) from the training step and the pitch values of those speech signals from the 
determining step not characterized as being substantially fully voiced in the 
characterizing step (determines the pitch for all voice segments hence the averaging 
would take into account all voiced and unvoiced segments, col. 10, lines 23-36); and 

the fully voiced speech signal is synthesized using a pitch periodic excitation train 
(periodic signals are generated by a pitch generator and supplied to the filter, col. 19, 
lines 2-8). 

Rothweiler does not teach training the Lloyd-Max quantizer using the pitch values
of those speech signals from the determining step characterized as being substantially 
fully voiced in the characterizing step. 

However, the Examiner takes Official Notice that it is common in the art to train a 
quantizer on the type of data that it would quantize and that unvoiced speech segments 
do not carry any pitch information. Therefore, it would have been obvious to one of 
ordinary skill in the art at the time of invention to modify the system of Rothweiler to train 
the Lloyd-Max quantizer using the pitch values of those speech signals from the 
determining step characterized as being substantially fully voiced in the characterizing 
step because it would give a method for quantizing the meaningful pitch value with a 
low error rate, hence improving decoded pitch value accuracy. 

Rothweiler does not teach the number of buffered frames of speech is increased 
during periods of substantially voiced speech to enable more accurate coding during the 
subsequent steps. 

Simpson teaches buffering the speech signal for a multiple of frames to be block 
quantized in subsequent steps (pitch is quantized in blocks of four pitch values, col. 41,
lines 19-20), wherein the number of buffered frames of speech is increased during 
periods of substantially voiced speech to enable more accurate coding during the 
subsequent steps (pitch values buffered only for voiced frames, hence the number of 
buffered frames is increased, col. 41, lines 20-25). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify the system of Rothweiler so that the number of buffered frames of 
speech is increased during periods of substantially voiced speech to enable more
accurate coding during the subsequent steps as taught by Simpson because vector 
quantization would achieve a lower bit rate for voiced frames compared to a scalar 
quantization, hence being more efficient. 

Rothweiler and Simpson do not teach that the speech that is not fully voiced is 
synthesized using a lowpass filtered pitch periodic excitation signal mixed with highpass 
white noise. 

Nishiguchi teaches a method for synthesizing speech that mixes a lowpass 
filtered pitch periodic excitation signal (voiced sound filtered by post-filter, col. 10, lines 
40-44) with a highpass white noise (unvoiced section filters noise codebook vectors to
combine with the pitch excitation train, Fig. 4, element 220).

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify the system of Rothweiler and Simpson so that speech that is not 
fully voiced is synthesized using a lowpass filtered pitch periodic excitation signal mixed 
with highpass white noise as taught by Nishiguchi because it would add white noise to 
voiced sections of the speech signal hence improving synthesis. 

Rothweiler and Simpson do not teach speech coding using perceptual weighting. 

Nishiguchi teaches the speech coder using perceptual weighting (perceptually 
weighted filter, Fig. 1, element 125). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify the system of Rothweiler and Simpson so that the speech coder 
uses perceptual weighting as taught by Nishiguchi because it would lower the 
quantization noise below the level of human perception hence making the system more
robust. 

18. As per claim 13, Rothweiler teaches 

determining a gain of the speech signal (signal power, col. 7, lines 26-30), 

the necessary training step includes training a Lloyd-Max quantizer with the gain
values (gain quantizer uses Lloyd-max algorithm for training, col. 13, lines 30-32), and

the quantizing step includes quantizing the gain values from the training step and
the gain values of those speech signals from the determining step not characterized as
being substantially fully voiced in the characterizing step (all frames are quantized, col.
13, lines 39-51).

Rothweiler, Simpson, and Nishiguchi do not teach training the quantizer with 
speech signals from the determining step characterized as being substantially fully 
voiced in the characterizing step. 

However, the Examiner takes Official Notice that using only voiced speech 
signals in training for gain quantization is common in the art. Therefore, it would have
been obvious to one of ordinary skill in the art at the time of invention to modify the
system of Rothweiler to train the quantizer with speech signals from the
determining step characterized as being substantially fully voiced in the characterizing 
step because it would be more important to get the gain right for voiced portions to 
avoid "zipper noise" due to volume variations. 

19. Claim 14 is rejected under 35 U.S.C. 103(a) as being unpatentable over
Rothweiler in view of Simpson and in further view of Nishiguchi and Rotola-Pukkila et al. 
(U.S. Pat. 6,732,070). 

Rothweiler, Simpson, and Nishiguchi do not teach the sampling step is 
performed at a variable sampling rate wherein the sampling rate is increased during 
periods of substantially voiced speech and decreased during other periods. 

Rotola-Pukkila teaches a codec that changes the sampling rate at times when a 
lower complexity or higher quality is needed in the sampling (col. 11, lines 64-67 and 
col. 12, lines 1-2). 

It would have been obvious to one of ordinary skill in the art at the time of 
invention to modify the system of Rothweiler, Simpson, and Nishiguchi to vary the 
sampling rate as taught by Rotola-Pukkila such that the sampling rate is increased 
during periods of substantially voiced speech and decreased during other periods 
because voiced speech needs higher quality than unvoiced speech, hence needing a 
higher complexity to represent the signal, to avoid artifacts. 

Conclusion 

20. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. Zinser et al. (U.S. Pat. 5,097,507) teaches a system that uses a 
Lloyd-Max quantizer in a speech coder. Chen (U.S. Pat. 5,745,871), Aguilar et al. (U.S. 
Pat. 6,691,082), and Crochiere et al. (U.S. Pat. 4,184,049) teach speech vocoders that 
use pitch and voicing information for coding. 

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Matthew J Sked whose telephone number is (703) 305-8663.
The examiner can normally be reached on Mon-Fri (8:00 am - 4:30 pm).

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Talivaldis Smits can be reached on (703) 306-3011. The fax phone number
for the organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 